
    A probabilistic model for gene content evolution with duplication, loss, and horizontal transfer

    We introduce a Markov model for the evolution of a gene family along a phylogeny. The model includes parameters for the rates of horizontal gene transfer, gene duplication, and gene loss, in addition to branch lengths in the phylogeny. The likelihood for the changes in the size of a gene family across different organisms can be calculated in O(N+hM^2) time and O(N+M^2) space, where N is the number of organisms, h is the height of the phylogeny, and M is the sum of family sizes. We apply the model to the evolution of gene content in Proteobacteria using the gene families in the COG (Clusters of Orthologous Groups) database.

    Inferring HIV-1 transmission networks and sources of epidemic spread in Africa with deep-sequence phylogenetic analysis

    To prevent new infections with human immunodeficiency virus type 1 (HIV-1) in sub-Saharan Africa, UNAIDS recommends targeting interventions to populations that are at high risk of acquiring and passing on the virus. Yet it is often unclear who and where these ‘source’ populations are. Here we demonstrate how viral deep-sequencing can be used to reconstruct HIV-1 transmission networks and to infer the direction of transmission in these networks. We are able to deep-sequence virus from a large population-based sample of infected individuals in Rakai District, Uganda, reconstruct partial transmission networks, and infer the direction of transmission within them at an estimated error rate of 16.3% [8.8–28.3%]. At this error rate, deep-sequence phylogenetics cannot be used against individuals in legal contexts, but the error rate is sufficiently low for population-level inferences into the sources of epidemic spread. The technique presents new opportunities for characterizing source populations and for targeting of HIV-1 prevention interventions in Africa.

    High percentage of undiagnosed HIV cases within a hyperendemic South African community: A population-based study

    Background: Undiagnosed HIV infections could undermine efforts to reverse the global AIDS epidemic by 2030. In this study, we estimated the percentage of HIV-positive persons who remain undiagnosed within a hyperendemic South African community. Methods: The data come from a population-based surveillance system located in the Umkhanyakude district of the northern KwaZulu-Natal province, South Africa. We annually tested 38 661 adults for HIV between 2005 and 2016. Using the HIV-positive test results of 12 039 (31%) participants, we then back-calculated the incidence of infection and derived the number of undiagnosed cases from this result. Results: The percentage of undiagnosed HIV cases decreased from 29.3% in 2005 to 15.8% in 2011. During this period, however, approximately 50% of the participants refused to test for HIV, which lengthened the average time from infection to diagnosis. Consequently, the percentage of undiagnosed HIV cases reversed direction and steadily increased from 16.1% to 18.9% over the 2012–2016 period. Conclusions: Results from this hyperendemic South African setting show that the HIV testing rate is low, with long infection times, and an unsatisfactorily high percentage of undiagnosed cases. A high level of repeat HIV testing is needed to minimise the time from infection to diagnosis if the global AIDS epidemic is to be reversed within the next two decades.

    Polymorphisms of large effect explain the majority of the host genetic contribution to variation of HIV-1 virus load

    Previous genome-wide association studies (GWAS) of HIV-1–infected populations have been underpowered to detect common variants with moderate impact on disease outcome and have not assessed the phenotypic variance explained by genome-wide additive effects. By combining the majority of available genome-wide genotyping data in HIV-infected populations, we tested for association between ~8 million variants and viral load (HIV RNA copies per milliliter of plasma) in 6,315 individuals of European ancestry. The strongest signal of association was observed in the HLA class I region that was fully explained by independent effects mapping to five variable amino acid positions in the peptide binding grooves of the HLA-B and HLA-A proteins. We observed a second genome-wide significant association signal in the chemokine (C-C motif) receptor (CCR) gene cluster on chromosome 3. Conditional analysis showed that this signal could not be fully attributed to the known protective CCR5Δ32 allele and the risk P1 haplotype, suggesting further causal variants in this region. Heritability analysis demonstrated that common human genetic variation, mostly in the HLA and CCR5 regions, explains 25% of the variability in viral load. This study suggests that analyses in non-European populations and of variant classes not assessed by GWAS should be priorities for the field going forward.